Abstract

Background and Objectives: Randomization, allocation concealment, and blind outcome assessment have been shown to 
reduce bias in human studies. Authors from the Collaborative Approach to Meta Analysis and Review of Animal Data from 
Experimental Studies (CAMARADES) collaboration recently found that these features protect against bias in animal stroke 
studies. We extended the scope of the CAMARADES work to include investigations of treatments for any condition. 

Methods: We conducted an overview of systematic reviews. We searched Medline and Embase for systematic reviews of 
animal studies testing any intervention (against any control) and we included any disease area and outcome. We included 
reviews comparing randomized versus not randomized (but otherwise controlled), concealed versus unconcealed treatment 
allocation, or blinded versus unblinded outcome assessment. 

Results: Thirty-one systematic reviews met our inclusion criteria: 20 investigated treatments for experimental stroke, 4 
reviews investigated treatments for spinal cord diseases, while 1 review each investigated treatments for bone cancer, 
intracerebral hemorrhage, glioma, multiple sclerosis, Parkinson's disease, and treatments used in emergency medicine. In 
our sample 29% of studies reported randomization, 15% of studies reported allocation concealment, and 35% of studies 
reported blinded outcome assessment. We pooled the results in a meta-analysis, and in our primary analysis found that 
failure to randomize significantly increased effect sizes, whereas allocation concealment and blinding did not. In our 
secondary analyses we found that randomization, allocation concealment, and blinding reduced effect sizes, especially 
where outcomes were subjective. 

Conclusions: Our study demonstrates the need for randomization, allocation concealment, and blind outcome assessment 
in animal research across a wide range of outcomes and disease areas. Since human studies are often justified based on 
results from animal studies, our results suggest that unduly biased animal studies should not be allowed to constitute part 
of the rationale for human trials. 
Introduction

Bias in Animal Studies 

Clinical epidemiologists and proponents of evidence-based 
medicine (EBM) have been using methods to reduce bias in 
human studies for over four decades. [1-5] Random allocation of 
participants to treatment groups, concealing the allocation 
sequence from those assigning participants to intervention groups 
(allocation concealment), and blinding of investigators assessing 
outcomes are now viewed as fundamental ways of ensuring quality 
and minimizing bias in clinical trials. [6] This is because concealed 
random allocation reduces selection bias and blinding outcome 
assessors reduces detection bias. [5] Armed with these methods, 
researchers have exposed several common medical practices as 
ineffective. For example, observational studies led us to believe 
that sodium fluoride reduced vertebral fractures, [7] that vitamin 
E reduced major coronary events, [8] and that high-dose aspirin 
was more effective than low-dose aspirin. [9] But subsequent 
randomized trials exposed all these treatments as useless or 
harmful. [10,11] Benefits of randomization, allocation 
concealment, and blinding have been confirmed in larger 
meta-epidemiological studies. In the earliest of these, Schulz et al. 
(1995) found that odds ratios were exaggerated by 30% in trials 
lacking allocation concealment and by 17% in studies that lacked 
blind outcome assessment. [12] Subsequent larger investigations 
have confirmed these results and also shown that adequate 
randomization reduces bias in human studies. [13,14] 

A growing body of evidence is beginning to suggest that 
randomization, allocation concealment, and blinding outcome 
assessment can also reduce the risk of bias of animal studies. [15-25] 
Some researchers hypothesize that avoidable biases in animal 
studies contribute to the failure to translate much experimental 
work for human benefit. [26,27] For example, while 503 of 835 
candidate drugs for use in the management of stroke appeared 
effective in animal models, only one (tissue plasminogen activator) 
has proved sufficiently efficacious in humans. [28] 

Much research into the empirical dimensions of bias in animal 
studies has been conducted by investigators from the Collaborative 
Approach to Meta Analysis and Review of Animal Data from 
Experimental Studies (CAMARADES) group. [29] CAMARADES 
researchers recently conducted an overview of systematic 
reviews of animal studies researching treatments for experimental 
stroke, and showed that failure to conceal allocation (but not 
failure to randomize or blind) exaggerated apparent treatment 
benefits in animal studies. [30] Despite this research, 
evidence-based principles have not yet been widely adopted in animal 
research; a recent study showed that only one in six controlled 
animal studies use randomization and only one in five use blind 
outcome assessment. [31] We therefore aimed to replicate the 
CAMARADES study independently and to expand its scope to 
include all conditions. 

Methods 

We conducted an overview of systematic reviews. The protocol 
(unpublished) was finalized by JH, CH, RP, and JA in October 
2012. We modified the protocol once to add the secondary 
analysis (testing the "unpredictability paradox"; see below). We 
searched MEDLINE and Embase databases (19 April 2012) and 
scanned reference lists for systematic reviews of animal studies that 
measured effects of randomization, allocation concealment, or 
blinding of outcome assessment. We included reviews in any 
disease area, using any intervention, any control group, any 
outcome measure and any animal model. We limited our search to 
the last 20 years and excluded human studies (search strategy in 
Appendix S1). We also excluded conference papers, studies not 
reported in English, ecological studies, and epidemiological 
studies. 

Two reviewers (JH and JAH) independently extracted data on 
numbers of studies, numbers of animals, disease/condition, 
outcomes, effect measures, and effect sizes with confidence 
intervals, using piloted data extraction forms. Disagreements were 
resolved by discussion with other authors. Authors were contacted 
to request data which were not reported. To enable inclusion of 
one review [32] we estimated the number of animals in 
randomized and non-randomized groups by calculating the mean 
number of animals per study. To test whether this estimation 
affected our results we carried out a sensitivity analysis by 
removing the study from the meta-analysis. We assessed the risk of 
bias of included systematic reviews using the Assessment of 
Multiple Systematic Reviews (AMSTAR) criteria. [33] 
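
The estimation step for the one review lacking per-group animal counts can be sketched as follows. This is only one plausible reading of the procedure (mean animals per study multiplied by the number of studies in each group); the function name and the example numbers are hypothetical and are not taken from the review in question.

```python
def estimate_group_animals(total_animals, total_studies, randomized_studies):
    """Split a review's animal count between randomized and non-randomized
    studies, assuming every study used the mean number of animals.

    All inputs here are illustrative; the estimate is only as good as the
    equal-size assumption.
    """
    if total_studies <= 0:
        raise ValueError("total_studies must be positive")
    if not 0 <= randomized_studies <= total_studies:
        raise ValueError("randomized_studies must lie in [0, total_studies]")
    mean_per_study = total_animals / total_studies
    randomized = mean_per_study * randomized_studies
    non_randomized = total_animals - randomized
    return randomized, non_randomized


# Hypothetical review: 1200 animals in 100 studies, 30 of them randomized.
randomized, non_randomized = estimate_group_animals(1200, 100, 30)
```

Because the equal-size assumption may not hold, an estimate of this kind is best paired with a sensitivity analysis that drops the affected review, as described above.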

We pooled results using the DerSimonian and Laird random 
effects model. [34] We reported outcomes for which differences 
between randomization/no randomization, allocation 
concealment/no allocation concealment, and blinding/no blinding were 
reported. We combined different outcomes and measurement 
units using standardized mean differences (SMDs), and quantified 
heterogeneity using the I-squared statistic. [35] We used 
meta-regression in a post-hoc analysis to examine whether various 
features influenced outcomes. Specifically, we investigated whether 
study size, disease state (stroke versus all other outcomes), or 
outcome measure were significantly associated with the effect size 
or could explain some of the heterogeneity. 
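
As an illustration of the pooling step, the following is a minimal sketch of the DerSimonian and Laird random-effects estimator together with the I-squared statistic. It is not the software used for the analyses reported here; the function name, the input format (one effect size and one sampling variance per review), and the example values are assumptions made for illustration.

```python
import math


def dersimonian_laird(effects, variances):
    """Pool effect sizes with the DerSimonian-Laird random-effects method.

    effects   -- per-review effect sizes (for example, differences in SMD)
    variances -- their sampling variances (squared standard errors)
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two effects with matching variances")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q and the method-of-moments between-review variance tau^2.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # Random-effects weights add tau^2 to each sampling variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum_w_star
    se = math.sqrt(1.0 / sum_w_star)

    # I^2: share of total variation attributed to heterogeneity, in percent.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    return {
        "pooled": pooled,
        "se": se,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
        "tau2": tau2,
        "q": q,
        "i2": i2,
    }


# Two hypothetical reviews with equal precision but different effects.
result = dersimonian_laird([0.0, 1.0], [0.1, 0.1])
```

With these two hypothetical inputs the estimator gives a pooled effect of 0.5, a between-review variance of 0.4, and an I-squared of 80%, which shows how heterogeneity widens the confidence interval relative to a fixed-effect analysis.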



For our secondary analysis we investigated the "unpredictability 
paradox", which was proposed in a similar study involving human 
subjects. [13] The paradox states that the difference between 
inadequately randomized and randomized studies, although real, 
is unpredictable in terms of direction. This is plausible, given that 
the direction of bias may relate to differences in expected results. 
To investigate the paradox we ignored direction to see whether 
there was an absolute difference between results in randomized 
and non-randomized studies. We used the same method to 
investigate the unpredictability paradox for adequate allocation 
concealment and blinding. This approach is useful only as a guide, 
since with a large enough sample some absolute difference is likely 
to arise by chance alone. 
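
The idea of ignoring direction, and the caveat about chance, can be made concrete with a small sketch. It uses a simple unweighted mean rather than the pooled random-effects analysis used in this study, and the function names and example differences are hypothetical. The second function gives the mean of the absolute value of a normally distributed quantity centred on zero, which is the size of absolute difference expected from sampling error alone.

```python
import math


def signed_and_absolute_means(differences):
    """Return the mean signed difference and the mean absolute difference.

    differences -- per-review differences in effect size between studies
                   with and without a design feature (hypothetical input).
    """
    if not differences:
        raise ValueError("differences must not be empty")
    n = len(differences)
    signed = sum(differences) / n
    absolute = sum(abs(d) for d in differences) / n
    return signed, absolute


def expected_abs_under_null(se):
    """Mean of |X| for X ~ Normal(0, se^2), i.e. the half-normal mean.

    Even when there is no true difference, the absolute value of a noisy
    estimate is positive on average, so a non-zero absolute difference is
    expected by chance.
    """
    return se * math.sqrt(2.0 / math.pi)


# Differences that cancel in sign but not in magnitude.
signed, absolute = signed_and_absolute_means([-0.3, 0.3, -0.1, 0.1])
chance_level = expected_abs_under_null(0.1)
```

In this toy example the signed mean is zero while the absolute mean is 0.2; comparing the absolute mean with the chance level for a typical standard error gives a rough sense of whether the absolute difference exceeds what noise alone would produce.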

Results 

We identified 238 articles from our electronic search, and a 
further 24 articles by hand searching references and contacting 
CAMARADES authors. Two authors (JH, JAH) excluded 199 
articles after reading titles and abstracts. We assessed the full text 
of the remaining 63 articles and excluded a further 32 for not 
including outcome data. CAMARADES authors generously 
shared data from 19 reviews in which data were not included in 
the published reports. We were left with 31 systematic reviews 
involving 7339 comparisons (estimated 123,437 animals) to 
include in the meta-analysis (see Figure 1). Characteristics of the 
31 included reviews are shown in Table 1, and our data are 
available freely from the authors. 

Twenty systematic reviews investigated treatments for experi- 
mental stroke, [17-20,24,28,32,36-47] four reviews investigated 
treatments for spinal cord diseases, [48-51] one review each 
investigated treatments for bone cancer, [52] intracerebral 
hemorrhage, [39] glioma, [53] multiple sclerosis, [54] Parkinson's 
disease, [55] and any treatments used in emergency medicine. 
Animal types included baboons, cats, dogs, ewes, gerbils, guinea 
pigs, lambs, marmosets, mice, monkeys, pigs, rabbits, rats, and 
sheep. In our sample 29% of studies reported randomization, 15% 
reported allocation concealment, and 35% reported blinded 
outcome assessment. 

1. Randomization 

Thirty reviews with 7249 comparisons (121,784 animals) 
reported the effects of randomization. Randomization reduced 
effect sizes by a moderate and statistically significant amount 
(SMD = -0.07, 95% CI -0.12 to -0.02, I² = 89.1%, P = 0.008) 
(Figure 2). In a subgroup analysis examining the effect of 
randomization by disease (stroke versus other), we found that 
randomization resulted in a lower effect size in areas other than 
stroke (SMD = -0.18, 95% CI -0.30 to -0.06) but not in stroke itself 
(SMD = -0.03, 95% CI -0.08 to 0.02). However, using 
meta-regression we found no significant difference between stroke and 
non-stroke on outcome measures (P = 0.08); additionally, 
meta-regression could not explain more than 3% of the heterogeneity. A 
sensitivity analysis excluding the single review [32] in which we 
had to estimate the number of animals did not alter the overall 
result (SMD = -0.08, 95% CI -0.13 to -0.03). In our secondary 
analysis (where we ignored direction of effect) we found a larger 
difference between randomized and non-randomized studies 
(SMD = -0.16, 95% CI -0.21 to -0.11, I² = 86.6%, P < 0.0001) 
compared with the effect size in which we took direction into 
consideration. 

Figure 1. Flowchart of identified and included studies. 
doi:10.1371/journal.pone.0098856.g001 



2. Allocation concealment 

Eighteen reviews with 2696 comparisons (39,405 animals) 
reported the effect of allocation concealment. Studies in which 
allocation concealment was used had slightly lower 
effect sizes, but the difference was not statistically significant 
(SMD = -0.04, 95% CI -0.09 to 0.00, I² = 51.6%, P = 0.059) (Figure 3). 
Subgroup analysis examining different diseases (stroke and 
non-stroke) showed that allocation concealment in studies of stroke 
resulted in significantly lower effect sizes (SMD = -0.07, 95% CI 
-0.12 to -0.02, I² = 48.5%, P = 0.009), whereas allocation 
concealment in other disease areas resulted in higher effect sizes 
(SMD = 0.05, 95% CI -0.01 to 0.11, I² = 0%, P = 0.128), but the 
difference between these groups was not found to be significant 
using meta-regression (P = 0.073). Meta-regression of the 
combination of disease and outcome measure did not explain more 
than 9% of the heterogeneity. In our secondary analysis (where we 
ignored direction of effect) we found a larger difference between 
concealed and non-concealed studies (SMD = -0.08, 95% CI 
-0.11 to -0.05, I² = 13.8%, P < 0.0001) compared with the effect 
size in which we took direction into consideration. 

3. Blinding 

Twenty-eight reviews involving 7140 comparisons (119,597 
animals) reported the effects of blinding of outcome assessment. 
Effect sizes in studies that involved blind outcome assessment were 
not significantly different from those in studies that did not 
(SMD = -0.01, 95% CI -0.04 to 0.03, I² = 68.3%, P = 0.667) (Figure 4). A 
sensitivity analysis excluding one study in which some estimates 
were made did not change the results. [16] We did not find any 
differences in effect sizes when we sub-divided studies into stroke 
and non-stroke groups. In a post-hoc subgroup analysis, we 
showed that blinding in studies reporting infarct volume did not 
significantly change effect size (SMD = 0.03, 95% CI -0.02 to 
0.08, P = 0.187), whereas blinding in those reporting 
neurobehavioral outcomes did (SMD = -0.06, 95% CI -0.10 to -0.02, 
P = 0.003), and this difference was significant when tested using 
meta-regression (P = 0.014). In our secondary analysis (in which 
effect direction was ignored) we found a larger difference between 
blinded and non-blinded studies (SMD = -0.08, 95% CI -0.11 to 
-0.06, I² = 49.5%, P < 0.001) compared with the effect size in 
which we took direction into consideration. 

4. Risk of bias 

Using AMSTAR (Table 2), we found a moderate risk of bias. It 
was encouraging that all 31 reviews assessed the quality of 
included studies, all but two reviews clearly used appropriate 
methods, and all but two reviews performed comprehensive 
literature searches. Yet only 9 reviews provided a protocol, and 
only 17 reviews searched the grey literature. 

Discussion 

In this overview of systematic reviews we found that failure to 
randomize is likely to result in overestimation of the apparent 
treatment benefits of interventions across a range of disease areas 
and outcome measures. We also found a borderline effect of 
allocation concealment but no overall effect of blinding in our 
primary analysis. We hypothesize that the reason for an effect of 
randomization but not allocation concealment or blinding is that 
subjective judgments are less likely to influence outcomes in trials 
of (relatively homogeneous) animal models compared with 
(relatively heterogeneous) humans. While animal heart rates 
[56], blood flow [57], and behavior can be conditioned by human 
handling so that placebo controls are sometimes also used in 
animal studies, [58] there are no 'patient-reported' (subjective) 
outcomes in animal studies. This may make some measures of 
expectancy effects (for which blinding is useful [5]) smaller in 
animal studies. Our hypothesis is supported by our post hoc 
analyses, which showed that blinding reduced effect sizes for (more 
subjective) neurobehavioral scores, but not for (more objective) 
infarct volume. It may also be relevant that the comparison of 
allocation concealment versus no allocation concealment was 
reported far less frequently (about half as often) than the other 
comparisons, so the failure to find an effect of allocation 
concealment could be due to insufficient power. A future 
major study of individual trials is now warranted to 
investigate the direction, magnitude, and conditions that must 
hold for randomization, allocation concealment, and blinding to 
reduce bias in animal studies. 

Figure 2. Forest plot showing the effect of random allocation on effect size. 

Our results corroborate those of the CAMARADES study, in 
the sense that we also identified significant bias in animal studies. 
However, whereas they found a borderline effect of allocation 
concealment, but no effect for blinding or randomization, we 
found an effect of randomization, a borderline effect for allocation 
concealment, and no effect for blinding. The differences between 
the two reviews could be because our review covered all disease 
areas, whereas theirs was limited to experimental stroke. In 
addition, our methods were different: we calculated standardized 
mean differences rather than the normalized mean differences 
(less widely used and more difficult to replicate) used by the 
CAMARADES researchers. 

Our study had several potential limitations. First, outcomes, 
animal models, and disease types were heterogeneous. The high 
levels of between-study heterogeneity of our overview could not be 
explained using meta-regression but may result from heterogeneity 
of the included reviews (and it was beyond the scope of our study 
to examine the sources of heterogeneity within our included 
reviews). Secondly, we relied on reports of systematic reviews; 
these, in turn, relied on reports of individual trials. Some trials may 
have failed to report randomization, allocation concealment, and 
blinding when in fact these were used, and vice versa. Evidence 
from clinical trials suggests that reporting quality is a good 
surrogate for actual risk of bias. If a similar relationship between 
reporting quality and study quality holds in animal studies, 
incomplete reporting may not have affected our results. [59] Based 
on reporting standards for clinical studies (that require, among 
other things, descriptions of how randomization, concealment, 
and blinding were achieved [60]), reporting standards for animal 
studies are emerging. [61] The Animal Research: 
Reporting In Vivo Experiments (ARRIVE) guidelines, developed 
in 2010, [62] arguably constitute the leading candidate for 
becoming a requirement, although development work in this area 
continues. [63] More recently, it has been suggested that until 
formal reporting guidelines become required, "at a minimum, 
authors of grant applications and scientific publications should 
report on randomization, blinding, sample-size estimation, and the 
handling of all data". [61] 

Figure 3. Forest plot showing the effect of allocation concealment on effect size. 

Thirdly, it is unclear whether publication bias affected 
our results. It has been estimated that 1 in 6 animal trials remains 
unpublished, [64] so some such effect is possible. 
If we assume that unpublished studies were equally likely to 
be randomized, allocation concealed, and blinded as they were to 
be non-randomized, not adequately concealed, and unblinded, 
then publication bias may not have affected the direction of our 
results. As with human studies, [65] compulsory registration of 
preclinical studies [66] would reduce publication bias and allow 
more precise estimates of the empirical dimensions of bias in 
animal studies. 

Fourthly, many of the individual trials included in the systematic 
reviews applied randomization, allocation concealment, and 
blinding together, whereas we examined these features 
independently. Of the 31 included reviews, 19 investigated experimental 
stroke. If stroke studies tend to be different from other types of 
studies this might have influenced the results, although we 
explored this using sub-group analysis and meta-regression. 
Fifthly, there was a disproportionate number of stroke studies 
included in our overview of systematic reviews. This was because 
stroke researchers have spearheaded empirical investigations 
of bias in animal research. Finally, this study was restricted to 
an investigation of the effects of randomization, allocation 
concealment, and blinding. Other features, such as lack of power, 
publication bias, choice of animal models, choice of sex of animals, 
and choice of outcome may also contribute to the internal and 
external validity of animal studies. [22,31,54,67] A future 
systematic review and meta-analysis of individual studies is now 
warranted to address these potential limitations. 

Figure 4. Forest plot showing the effect of blinding of outcome assessment on effect size. 

Our study has implications that extend beyond the conduct of 
animal studies. Only animal studies that do not suffer from 
avoidable bias should be accepted as justification for human 
studies. For this reason, the United States Food and Drug 
Administration (FDA), [68] the Medical Research Council (MRC) 
in the United Kingdom, [69] and the World Health Organization 
(WHO) [70] insist on fair tests, often involving systematic reviews 
of high quality randomized trials. Our study therefore supports the 
requirement for adequate conduct and reporting of animal studies, 
including those being promoted by CAMARADES and SABRE 
Research UK. [71] 



Conclusions 

Our overview of systematic reviews and meta-analyses revealed 
that failure to randomize leads to exaggerated effect sizes in animal 
studies across a wide range of disease areas. In our secondary 
analysis we found that failure to conceal allocation or employ blind 
outcome assessment exaggerates effect sizes in animal studies. 
Biased animal research is less likely to provide trustworthy results, 
is less likely to provide a rationale for research that will benefit 
humans, and wastes scarce resources. Requiring compulsory study 
registration and adherence to emerging evidence-based standards 
for the conduct and reporting of animal research is likely to reduce 
the risk of bias in animal studies and improve translatability of 
animal research. 